Enhanced Word Classing for Recurrent Neural Network Language Models
Abstract
The Recurrent Neural Network Language Model (RNNLM) has recently been shown to outperform conventional N-gram LMs as well as many other competing advanced language modeling techniques. However, the computational complexity of an RNNLM is much higher than that of a conventional N-gram LM. As a result, the class-based RNNLM (CRNNLM) is usually employed to speed up both the training and testing phases of the RNNLM. In previous work with RNNLMs, a simple method based on word frequency has been used to derive word classes. In this paper, we take a closer look at classing and explore how to improve RNNLM performance by enhancing word classing. More specifically, we employ bigram mutual information clustering, a classical and more accurate word clustering method, to obtain word classes. Finally, experiments on the standard Penn Treebank test set show that a 5%∼7% relative reduction in perplexity (PPL) can be obtained with the bigram mutual information clustering method compared to the frequency-based word clustering method.
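To make the speed-up concrete: with word classes, the output distribution factorizes as P(w|h) = P(c(w)|h) · P(w|c(w), h), so each prediction normalizes over the classes and over the words within one class, roughly O(|C| + |V|/|C|) operations instead of O(|V|). The sketch below is a minimal, hypothetical Python illustration of the frequency-based classing baseline mentioned above (equal unigram probability mass per class, one common variant); it is not the paper's exact recipe, and all names are illustrative.

```python
from collections import Counter

def frequency_classes(word_counts, num_classes):
    """Assign words to classes by unigram frequency: sort words by count and
    cut the cumulative unigram distribution into num_classes bins of roughly
    equal probability mass (a common frequency-binning baseline)."""
    total = float(sum(word_counts.values()))
    classes, cum, cls = {}, 0.0, 0
    for word, count in sorted(word_counts.items(), key=lambda kv: -kv[1]):
        classes[word] = cls
        cum += count / total
        # Move to the next class once this bin has its share of the mass.
        if cum > (cls + 1) / num_classes and cls < num_classes - 1:
            cls += 1
    return classes

# Toy usage: frequent words land in low-numbered classes.
counts = Counter("the cat sat on the mat the cat".split())
print(frequency_classes(counts, num_classes=2))
# {'the': 0, 'cat': 0, 'sat': 1, 'on': 1, 'mat': 1}
```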
Similar resources
Enhanced word classing for Model M
Model M is a superior class-based n-gram model that has shown improvements on a variety of tasks and domains. In previous work with Model M, bigram mutual information clustering has been used to derive word classes. In this paper, we introduce a new word classing method designed to closely match with Model M. The proposed classing technique achieves gains in speech recognition word-error rate o...
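For context, bigram mutual information clustering (the Brown et al.-style method referenced by both papers) seeks a class assignment c(·) that maximizes the average mutual information between the classes of adjacent words; the standard objective is

$$\mathrm{AMI} = \sum_{c_1, c_2} p(c_1, c_2)\, \log \frac{p(c_1, c_2)}{p(c_1)\, p(c_2)}$$

where p(c_1, c_2) is the relative frequency with which a word in class c_1 is immediately followed by a word in class c_2. Greedy merge or exchange moves are typically applied until no move improves the objective.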
Prosodically-enhanced recurrent neural network language models
Recurrent neural network language models have been shown to consistently reduce the word error rates (WERs) of large vocabulary speech recognition tasks. In this work we propose to enhance the RNNLMs with prosodic features computed using the context of the current word. Since it is plausible to compute the prosody features at the word and syllable level, we have trained the models on prosody fea...
A Study on Neural Network Language Modeling
An exhaustive study on neural network language modeling (NNLM) is performed in this paper. Different architectures of basic neural network language models are described and examined. A number of different improvements over basic neural network language models, including importance sampling, word classes, caching and bidirectional recurrent neural network (BiRNN), are studied separately, and the...
Restricted Recurrent Neural Tensor Networks
Increasing the capacity of recurrent neural networks (RNN) usually involves augmenting the size of the hidden layer, resulting in a significant increase of computational cost. An alternative is the recurrent neural tensor network (RNTN), which increases capacity by employing distinct hidden layer weights for each vocabulary word. However, memory usage scales linearly with vocabulary size, which...
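A quick, hypothetical back-of-the-envelope check of that linear scaling (numbers are illustrative, not from the paper):

```python
# Compare recurrent-parameter counts: a plain RNN shares one H x H recurrent
# matrix, while an RNTN keeps a distinct H x H matrix per vocabulary word.
V, H = 10_000, 200
shared_rnn = H * H           # 40,000 recurrent weights, independent of V
per_word_rntn = V * H * H    # 400,000,000 weights: linear in V
print(shared_rnn, per_word_rntn)
```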
Sequential Recurrent Neural Networks for Language Modeling
Feedforward Neural Network (FNN)-based language models estimate the probability of the next word based on the history of the last N words, whereas Recurrent Neural Networks (RNN) perform the same task based only on the last word and some context information that cycles in the network. This paper presents a novel approach, which bridges the gap between these two categories of networks. In partic...
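In symbols, the two families condition differently (standard formulations, not specific to this paper): an FNN language model estimates

$$P(w_t \mid w_{t-1}, \dots, w_{t-N+1}),$$

while an RNN language model estimates $P(w_t \mid h_t)$ with recurrent state $h_t = f(U\, x_{w_{t-1}} + W\, h_{t-1})$, where $x_{w_{t-1}}$ encodes the previous word and $h_{t-1}$ carries the remaining history.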
Journal title:
Volume / Issue:
Pages: -
Publication date: 2013